FASE: A Framework for Scalable Performance Prediction of HPC Systems and Applications
Abstract
As systems of computers become more complex in terms of their architecture, interconnect and heterogeneity, the optimum configuration and utilization of these machines becomes a major challenge. To reduce the penalties caused by poorly configured systems, simulation is often used to predict the performance of key applications to be executed on the new systems. Simulation provides the capability to observe component and system characteristics (e.g. performance and power) in order to make vital design decisions. However, simulating high-fidelity models can be very time-consuming and even prohibitive when evaluating large-scale systems. The Fast and Accurate Simulation Environment (FASE) framework seeks to support large-scale system simulation by using high-fidelity models to capture the behavior of only the performance-critical components while employing abstraction techniques to capture the effects of those components with little impact on the system. In order to achieve this balance of accuracy and simulation speed, FASE provides a methodology and associated toolset to evaluate numerous architectural options. This approach allows users to make system design decisions based on quantifiable demands of their key applications rather than using manual analysis, which can be error-prone and impractical for large systems. The framework accomplishes this evaluation through a novel approach of combining discrete-event simulation with an application characterization scheme in order to remove unnecessary details while focusing on components critical to the performance of the application. In this paper, we present the methodology and techniques behind FASE and include several case studies validating systems constructed using various applications and interconnects.
Similar resources
Early Prediction of the Cost of Cloud Usage for HPC Applications
After a decade of diffusion, cloud computing has received wide acceptance, but it is not yet attractive for the HPC community. Clouds could be a cost-effective alternative to clusters and supercomputers, providing economy of scale, elasticity, flexibility, and easy customization. Unfortunately, most clouds are optimized for running business applications, not for HPC. However, they can be profit...
Data Replication-Based Scheduling in Cloud Computing Environment
High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...
Energy Efficiency of Parallel Multicore Programs
The increasing energy consumption of large-scale high performance resources raises technical and economical concerns. A reduction of consumed energy in multicore systems is possible to some extent with an optimized usage of computing and memory resources that is tailored to specific HPC applications. The essential step towards more sustainable consumption of energy is its reliable measurements ...
SONAR: Automated Communication Characterization for HPC Applications
Future computing systems will need to operate within hard power and energy constraints; this is particularly true for Exascale-class systems. These constraints are hard for technical, economical and ecological reasons; thus, such systems have to operate within given power and energy budgets. Therefore, we anticipate the need for modeling tools that help to predict power and energy consumption. ...
Applying an Automated Framework to Produce Accurate Blind Performance Predictions of Full-Scale HPC Applications
This work builds on an existing performance modeling framework that has been proven effective on a variety of HPC systems. This paper will illustrate the framework’s power by creating blind predictions for three systems as well as establishing sensitivity studies to advance understanding of observed and anticipated performance of both architecture and application. The predictions are termed bli...
Journal title:
- Simulation
Volume 83, Issue
Pages -
Publication date: 2007